Voice activity detection algorithms using subband power distance feature for noisy environments
Authors
Abstract
In this paper, we propose two robust voice activity detection (VAD) methods for adverse environments. A single subband power distance (SPD) feature is estimated from different wavelet subbands and further improved to be robust against noise. The first method is based on a neural network whose input vector consists of the SPD feature and its first and second derivatives. The second method is an adaptive threshold-based algorithm that employs only the single SPD feature. A statistical percentile filter based on long-term information is enhanced to estimate the noise threshold more adaptively. The proposed methods are evaluated and compared with state-of-the-art VAD algorithms on the TIMIT database, which was artificially distorted by different additive noise types. The results show that the proposed VAD methods are very robust to environmental noise and mostly outperform standard VADs such as ETSI AFE ES 202 050 and ITU-T G.729 B.
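The adaptive threshold idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the nearest-rank percentile, the sliding window, and the parameters `window`, `q`, and `offset` are all assumptions chosen for illustration, and the input is any per-frame scalar standing in for the SPD feature.

```python
from collections import deque


def percentile(values, q):
    """Return the q-th percentile (0..100) of values, nearest rank on a sorted copy."""
    ordered = sorted(values)
    idx = round(q / 100 * (len(ordered) - 1))
    return ordered[idx]


def adaptive_threshold_vad(feature, window=50, q=20, offset=3.0):
    """Label each frame as speech (True) or non-speech (False).

    feature: per-frame scalar values (a stand-in for the SPD feature).
    window, q, offset: hypothetical tuning parameters, not taken from the paper.
    """
    history = deque(maxlen=window)  # long-term buffer of recent feature values
    decisions = []
    for value in feature:
        history.append(value)
        # A low percentile of the recent history serves as the noise-level estimate.
        noise_floor = percentile(history, q)
        decisions.append(value > noise_floor + offset)
    return decisions


frames = [1.0, 1.2, 0.9, 1.1, 9.0, 10.0, 9.5, 1.0, 1.1]
print(adaptive_threshold_vad(frames))
```

With the toy input above, only the three high-valued frames are marked as speech. A low percentile is used because, over a long window, the smallest feature values are assumed to come from noise-only frames, so the threshold follows slow changes in the background level.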
Similar resources
Improved voice activity detection combining noise reduction and subband divergence measures
Currently, new trends in wireless communications are demanding reliable human-machine interaction in real-life environments. However, there are obstacles that prevent automatic speech recognition (ASR) systems from working in noisy environments. The main difficulty is the degradation suffered by ASR systems due to a mismatch between training and test conditions. This paper shows an improved voice acti...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
C-Means Clustering Applied to Speech Discrimination
An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built over a set of subband log-energies. Detecting the presence of speech frames (a new cluster) is achieved using a basic sequential algorithm scheme (BSAS) according...
Voice activity detection in noisy environments
The subject of this paper is robust voice activity detection (VAD) in noisy environments, especially in car environments. We present a comparison between several frame-based VAD feature extraction algorithms in combination with different classifiers. Experiments are carried out under equal test conditions using clean speech, clean speech with added car noise and speech recorded in car environme...
Hard C-means clustering for voice activity detection
An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built on a set of subband log-energies and noise prototypes that define a cluster. Detecting the presence of speech (a new cluster) is achieved using a basic sequentia...